PI: Dr. Thomas Girke, professor at Department of Botany and Plant Sciences, University of California, Riverside (thomas.girke@ucr.edu)
Maintainer: Jianhai Zhang (jzhan067@ucr.edu; zhang.jianhai@hotmail.com)
The R/Bioconductor package spatialHeatmap is designed for intuitive visualisation of large-scale data, provided that a data matrix and an SVG image configured to match it are supplied. This tutorial explains step by step how to make an SVG image and configure it to match the data matrix.
Making a custom SVG image requires a PNG image in which the regions have clear contours, a data matrix, the SVG editor Inkscape, and optionally the image editor GIMP. The PNG image serves as the template, while the data matrix is used to colour the different regions in the SVG image. Inkscape is used to associate the SVG image with the data matrix, and GIMP is optionally used to extract polygons for the SVG image automatically.
In the following, the tutorial uses a configured pair of a gene expression matrix and an SVG image of root tissues. All the files used in this tutorial can be downloaded here (to download an SVG image: hover over the image, right click, and select “Save image as…”; to download a PNG image: click the image, click “Download”, right click, and select “Save image as…”; to download a TXT file: click the file, click “Raw”, right click, and select “Save as…”).
The gene expression matrix should be normalised and filtered before downstream processing. In this tutorial the normalisation process is not covered. The function filter.data (Morgan et al. 2018; Dowle and Srinivasan 2018; R Core Team 2018) has two filter arguments pOA and CV, corresponding to pOverA and cv in the package “genefilter” (Gentleman et al. 2018), respectively.
In the gene expression matrix, the row and column names should be gene IDs and sample/conditions, respectively. The sample/condition names MUST be formatted this way: a sample name is followed by a double underscore and then the condition, such as “epidermis__140mM_1h” in Table 1 (Geng et al. 2013), where epidermis is the sample and 140mM_1h is the condition. In the sample/condition column names, only letters, digits, single underscores, single spaces, or dots are allowed. Samples in the matrix do not all need to be present in the SVG image, and vice versa. Only samples present in the SVG image are recognised and coloured.
The expression matrix is stored as a “SummarizedExperiment” object. Metadata of genes and sample/conditions can be optionally added. Refer to the R package “SummarizedExperiment” for more details (Morgan et al. 2018).
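The naming rule above can be checked before building the matrix. The following base R sketch is one reading of that rule, not a function from spatialHeatmap: the helper name `check.colnames` and the regular expression are assumptions made for illustration. It accepts names made of letters, digits, and dots joined by single underscores or single spaces, with exactly one double underscore separating sample from condition.

```r
# Hypothetical helper (not part of spatialHeatmap): validate and split
# "sample__condition" column names.
check.colnames <- function(cols) {
  tok  <- "[A-Za-z0-9.]+"                   # letters, digits, dots
  part <- paste0(tok, "([_ ]", tok, ")*")   # tokens joined by a single "_" or " "
  pat  <- paste0("^", part, "__", part, "$")  # exactly one "__" between the two parts
  valid <- grepl(pat, cols)
  data.frame(
    name      = cols,
    valid     = valid,
    sample    = ifelse(valid, sub("__.*$", "", cols), NA_character_),
    condition = ifelse(valid, sub("^.*__", "", cols), NA_character_),
    stringsAsFactors = FALSE
  )
}

cols <- c("epidermis__140mM_1h",   # valid
          "epidermis_140mM_1h",    # no double underscore
          "epidermis__140mM__1h",  # two double underscores
          "root-tip__control")     # "-" is not an allowed character
res <- check.colnames(cols)
print(res)
```

Only the first name passes; for it, the sample is `epidermis` and the condition is `140mM_1h`.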
library(spatialHeatmap); library(data.table); library(SummarizedExperiment)
# Create the "SummarizedExperiment" class.
## The expression matrix, where the row and column names should be gene IDs and sample/conditions, respectively.
data.path <- system.file("extdata/example", "root_expr_row_gen.txt", package = "spatialHeatmap")
expr <- fread(data.path, sep='\t', header=TRUE, fill=TRUE)
col.na <- colnames(expr)[-ncol(expr)]; row.na <- as.data.frame(expr[, 1])[, 1]
expr <- as.matrix(as.data.frame(expr, stringsAsFactors=FALSE)[, -1])
rownames(expr) <- row.na; colnames(expr) <- col.na
con.path <- system.file("extdata/example", "root_con.txt", package = "spatialHeatmap")
## Condition is a single column data frame.
con <- read.table(con.path, header=TRUE, row.names=NULL, sep='\t', stringsAsFactors=FALSE)
ann.path <- system.file("extdata/example", "root_ann.txt", package = "spatialHeatmap")
## Gene annotation is a single column data frame.
ann <- read.table(ann.path, header=TRUE, row.names=1, sep='\t', stringsAsFactors=FALSE)
## The expression matrix, gene annotation, and condition are stored in a "SummarizedExperiment" object. Gene annotation and condition are optional.
expr <- SummarizedExperiment(assays=list(expr=expr), rowData=ann, colData=con)
# Filter genes. In "pOA", genes with expression value A >= 1 in at least p=0.03 (3%) of all samples are retained; in "CV", genes with a coefficient of variation (cv) between 0.1 and 10000 are retained, where the upper limit is set very high (10000) so as to keep all genes with cv over 0.1.
exp <- filter.data(data=expr, pOA=c(1, 0.03), CV=c(0.1, 10000), dir=NULL)
# Get the filtered matrix. "filter.data" returns a "SummarizedExperiment" object.
df <- assay(exp)
| | epidermis__standard_1h | epidermis__140mM_1h | epidermis__140mM_3h | epidermis__140mM_8h |
|---|---|---|---|---|
| PSAC | 2.944807 | 2.457910 | 2.862155 | 2.313841 |
| NDHG | 4.243482 | 4.072965 | 4.154074 | 4.935179 |
| PETG | 4.830398 | 5.516260 | 5.390418 | 5.372507 |
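To make the two filter criteria concrete, here is a minimal base R sketch of the idea behind them, applied to a made-up toy matrix. It does not call filter.data or genefilter; the variable names (`A`, `p`, `cv.range`) and the toy values are assumptions for illustration, and the real functions may differ in details such as NA handling.

```r
# Toy matrix: rows are genes, columns are samples (values are invented).
mat <- rbind(
  geneA = c(0.2, 0.1, 0.3, 0.2),  # never reaches 1, so it fails the pOA criterion
  geneB = c(5.0, 5.0, 5.0, 5.0),  # no variation, so it fails the CV criterion
  geneC = c(0.5, 2.0, 4.0, 8.0)   # passes both criteria
)

A <- 1                     # expression threshold
p <- 0.03                  # minimum proportion of samples with value >= A
cv.range <- c(0.1, 10000)  # allowed range of the coefficient of variation

# Proportion of samples at or above A, per gene.
keep.pOA <- rowMeans(mat >= A) >= p

# Coefficient of variation: standard deviation divided by the absolute mean.
cv <- apply(mat, 1, sd) / abs(rowMeans(mat))
keep.cv <- cv >= cv.range[1] & cv <= cv.range[2]

filtered <- mat[keep.pOA & keep.cv, , drop = FALSE]
print(rownames(filtered))
```

In this toy example only `geneC` survives both filters.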
If the contours in the PNG image are not clear, GIMP may generate low-quality SVG images. In this case, the blank SVG image can be drawn in Inkscape using the PNG as a template. Below is an example of drawing only two polygons.
Open the root PNG image (Mustroph et al. 2009) in Inkscape. The image can be zoomed by pressing “-” or “+” on the keyboard. Select “Draw freehand lines (F6)” in the left tool bar. Left click once at the first corner of the polygon, move to the second corner and double left click, and so on. Lastly, when drawing the last line, click at the first corner to seal the polygon.
Select “Edit path by nodes (F2)” from the left tool bar. Draw a large rectangle to select the whole sealed polygon and draw another large rectangle to select all nodes.
Click the “Make selected nodes smooth” in the top tool bar. Drag the edges and handles to make the sealed polygon aligned with the template, then the first polygon is finished.
Alternatively, the polygons can be made with rectangles. Click “Create rectangles and squares (F4)” at the left tool bar and draw a rectangle in the second template polygon. Select the rectangle and click “Object to Path” under the “Path” tab at the top, then the rectangle becomes an SVG path, which can be edited. Switch cursor to “Select and Transform Objects (F1)” and rotate the rectangle as expected.
Switch the cursor to “Edit path by nodes (F2)” and select all nodes of the rectangle. Click “Make selected nodes corner” at the top tool bar. Drag the edges and handles to overlay the rectangle path on the template, then the second polygon is finished.
This root image contains many polygons, so drawing them individually takes too much time. However, GIMP can be used to extract the polygons automatically.
The drawing method creates accurate SVG images but is time-consuming, while the GIMP method is faster but can generate fused polygons. In this tutorial, the root PNG image has well-separated polygons and clear contours, so GIMP can produce accurate SVG images. Otherwise, the resulting SVG images would contain fused and noisy polygons. Considering the pros and cons of the two methods, a good practice is to first use GIMP to extract polygons and then use the drawing method to refine them. Below is an example of an SVG image with fused polygons, which is generated by GIMP.
Open the blank fused SVG image in Inkscape (the image can be zoomed by pressing “-” or “+” on the keyboard). Under the “Layers” tab at the top, click “Layers”; the “Layers” panel will appear on the right. Then click “+” at the bottom left corner of the panel to add “Layer 1”.
Draw a rectangle over the fused SVG graph and cut. Then click on “Layer 1” and paste the fused SVG into “Layer 1” (make sure that “Layer 1” is unlocked by referring to the lock symbol). Open the “XML Editor” from the “Edit” tab at the top. If “<svg:path id="path 77">” is under “<svg:g id="layer1" inkscape:label="Layer 1">”, then the fused SVG is inside “Layer 1”.
Draw a rectangle over the image, click “Break Apart” under the “Path” tab, then select the outer noisy rectangle by clicking on its edge. Press “delete” on the keyboard to delete it.
Click the edge of the large fused polygon and move it away. Use the [drawing method](#draw) to make new polygons (blue) with the fused ones as templates. Delete the fused polygons and move the new polygons back to make the tissue complete.
If a large fused polygon needs to be separated, one can use the eraser tool. Drag the fused polygon away from the tissue, select the eraser tool from the left tool bar, then use the eraser to cut the fused polygon into three independent polygons. Select the cut polygons and click “Break Apart” under the “Path” tab at the top, then the three polygons are separated. Place back the three polygons to make the tissue complete.
Under “Object” tab at the top, select “Fill and Stroke…”, then the “Fill and Stroke (Shift+Ctrl+F)” panel will come out on the right. Select all polygons by drawing a large rectangle over them.
Under the “Stroke paint” tab in the fill and stroke panel, select “Flat color”. Under the “Stroke style” tab, set the stroke width, e.g.: 1.5 px.
Under the “Fill” tab, click “No paint” to get a blank SVG image, which is ready to use in the next section.
In the SVG image, each polygon has a unique ID. To plot spatial heatmaps, these IDs should be replaced with names that exactly match the sample names in the gene matrix. No matter how the blank SVG image is created, it should be placed inside a layer before starting the following steps.
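Whether the polygon IDs and the sample names agree can be checked outside Inkscape. The base R sketch below uses an invented inline SVG and invented column names; it extracts IDs with a regular expression rather than an XML parser, and it only looks at `<path>` elements (for grouped tissues the group IDs would need to be collected as well). None of this is spatialHeatmap's own matching code.

```r
# Invented SVG content, standing in for a file read with readLines().
svg <- c(
  '<svg xmlns="http://www.w3.org/2000/svg">',
  '  <g id="layer1">',
  '    <path id="epidermis" d="M0,0 L10,0 L10,10 Z"/>',
  '    <path id="cortex" d="M10,0 L20,0 L20,10 Z"/>',
  '    <path id="path77" d="M20,0 L30,0 L30,10 Z"/>',
  '  </g>',
  '</svg>'
)
txt <- paste(svg, collapse = "\n")

# Pull the id attribute out of every <path> element.
hits <- regmatches(txt, gregexpr('<path[^>]*[[:space:]]id="[^"]*"', txt))[[1]]
ids  <- sub('.*id="([^"]*)"$', "\\1", hits)

# Invented column names in the "sample__condition" format.
cols    <- c("epidermis__standard_1h", "epidermis__140mM_1h",
             "cortex__standard_1h", "stele__standard_1h")
samples <- unique(sub("__.*$", "", cols))

matched   <- intersect(ids, samples)  # polygons that can be coloured
svg.only  <- setdiff(ids, samples)    # polygon IDs still to be renamed or left blank
data.only <- setdiff(samples, ids)    # samples with no polygon in the image
print(list(matched = matched, svg.only = svg.only, data.only = data.only))
```

Here `path77` is a polygon whose ID has not yet been replaced with a sample name, and `stele` is a sample with no counterpart in the image.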
If multiple polygons belong to the same tissue type, they should be grouped together. The example of grouping epidermis is given below. Open “XML Editor…” under the “Edit” tab at the top, then the “XML Editor (Shift+Ctrl+X)” panel comes out on the right. Click all the epidermis polygons while pressing the “Shift” key, right click, and select “Group”. A group should not contain another group.
The epidermis group <svg:g id="g987"> shows up in the XML panel. Click “id”, change “g987” to “epidermis”, and click “Set”; the new group id is then set.
In the “Fill and Stroke (Shift+Ctrl+F)” panel, select “Flat color” under the “Fill” tab, then specify a colour for epidermis.
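The rule that a group should not contain another group can be checked roughly by counting how deeply `<g>` elements are nested. The base R sketch below is an illustration under stated assumptions: the helper `group.depth` is hypothetical, the SVG snippets are invented, the image is assumed to sit in a single layer (Inkscape stores a layer as a `<g>` element), and self-closing `<g/>` elements are not handled. Under those assumptions, a depth of 2 means layer plus tissue groups, and anything deeper indicates a nested group.

```r
# Hypothetical helper: maximum nesting depth of <g> elements in SVG text.
group.depth <- function(txt) {
  tags <- regmatches(txt, gregexpr("<g[[:space:]>]|</g>", txt))[[1]]
  step <- ifelse(tags == "</g>", -1L, 1L)  # +1 on opening, -1 on closing
  max(c(0L, cumsum(step)))
}

# Layer > tissue group > paths: acceptable.
ok <- paste(
  '<g id="layer1">',
  '  <g id="epidermis"><path id="p1"/><path id="p2"/></g>',
  '</g>', sep = "\n")

# Layer > tissue group > inner group: the inner group should be ungrouped.
bad <- paste(
  '<g id="layer1">',
  '  <g id="epidermis"><g id="g12"><path id="p1"/></g><path id="p2"/></g>',
  '</g>', sep = "\n")

print(c(ok = group.depth(ok), bad = group.depth(bad)))
```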
Group other tissues, set ids and colours.
The polygons are stacked over each other according to their order in the “XML Editor”, so the first polygon might be invisible because it could be covered by the second, third, and so on. For instance, in the brain SVG image (anatomybodysystem.com 2017; epilepsyresearch 2017), the grey outline polygon (the path “rect5480”) is the first one and is partially covered by other polygons. Therefore, users should drag the paths into the expected order.
Users can add text to label tissues. Basically, the text is first typed in with the text tool and then the text object is converted to paths. Next, the paths are added to the polygon group of the target tissue and filled with the same tissue colour. Below is an example of adding text to the epidermis tissue.
Select “Create and select text objects (F8)” from the left tool bar, drag a text box, and type epidermis. Click on the text object and convert it to a path.
Click on the text paths and fill them with the same colour as epidermis using “Pick colors from image (F7)” from the left tool bar. In the “Fill and Stroke” panel, set the stroke style.
Click and cut the text, then double click epidermis (green) to enter the group.
Paste the text anywhere then the text is inside the epidermis group as a “text group”. Move/resize the text group as expected. Right click and ungroup the text.
All the letters are in the epidermis group as individual paths, which can be seen in the XML Editor.
To add a pointer, draw a rectangle, convert it to a path, and fill it with the same style as epidermis. Move, rotate, and resize it as needed.
Clicking any of the other tissues will group the text, pointer, and epidermis together, which can be confirmed by dragging epidermis to see that they move as a whole.
Similarly, add text to label other tissues.
anatomybodysystem.com. 2017. “HEAD ANATOMY.” http://anatomybodysystem.com/lateral-view-of-the-brain-labeled/lateral-view-of-the-brain-labeled-brain-diagram-and-label-anatomy-body-list/.
Dowle, Matt, and Arun Srinivasan. 2018. Data.table: Extension of ‘Data.frame‘. https://CRAN.R-project.org/package=data.table.
epilepsyresearch. 2017. “The Hippocampus: What Is It?” https://www.epilepsyresearch.org.uk/the-hippocampus-what-is-it/.
Geng, Yu, Rui Wu, Choon Wei Wee, Fei Xie, Xueliang Wei, Penny Mei Yeen Chan, Cliff Tham, Lina Duan, and José R Dinneny. 2013. “A Spatio-Temporal Understanding of Growth Regulation During the Salt Stress Response in Arabidopsis.” Plant Cell 25 (6): 2132–54.
Gentleman, R, V Carey, W Huber, and F Hahne. 2018. “Genefilter: Methods for Filtering Genes from High-Throughput Experiments.” http://bioconductor.uib.no/2.7/bioc/html/genefilter.html.
Morgan, Martin, Valerie Obenchain, Jim Hester, and Hervé Pagès. 2018. SummarizedExperiment: SummarizedExperiment Container.
Mustroph, Angelika, M Eugenia Zanetti, Charles J H Jang, Hans E Holtan, Peter P Repetti, David W Galbraith, Thomas Girke, and Julia Bailey-Serres. 2009. “Profiling Translatomes of Discrete Cell Populations Resolves Altered Cellular Priorities During Hypoxia in Arabidopsis.” Proc Natl Acad Sci U S A 106 (44): 18843–8.
R Core Team. 2018. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.